
Performance of ChatGPT on the Korean National Examination for Dental Hygienists

Journal of Dental Hygiene Science, 2024, Vol. 24, No. 1, pp. 62-70
Bae Soo-Myoung, Jeon Hye-Rim, Kim Gyoung-Nam, Kwak Seon-Hui, Lee Hyo-Jin
Affiliation details
Bae Soo-Myoung - 
Jeon Hye-Rim - 
Kim Gyoung-Nam - 
Kwak Seon-Hui - Gangneung-Wonju National University College of Dentistry Department of Dental Hygiene
Lee Hyo-Jin - 

Abstract


Background: This study aimed to evaluate the accuracy of ChatGPT's responses to questions from the Korean national dental hygienist examination. In addition, by analyzing ChatGPT's incorrect responses, the study sought to identify the predominant types of errors.

Methods: To evaluate ChatGPT-3.5's performance by question type, the researchers classified the 200 questions of the 49th Korean National Dental Hygienist Examination into recall, interpretation, and problem-solving questions. The questions were modified to prevent misunderstandings arising from implied meanings or Korean technical terminology. To assess ChatGPT-3.5's ability to apply previously acquired knowledge, each question was first presented in an open-ended (subjective) format; if ChatGPT-3.5 generated an incorrect response, the original multiple-choice format was then provided. All 200 questions were input into ChatGPT-3.5, and the generated responses were analyzed. The researchers evaluated the accuracy of each response according to question type and categorized the incorrect responses by error type (logical, information, and statistical errors). Finally, a response was classified as a hallucination when ChatGPT presented false information as if it were true.

Results: ChatGPT's responses to the national examination were 45.5% accurate. Accuracy by question type was 60.3% for recall questions and 13.0% for problem-solving questions. For the problem-solving questions, accuracy was 13.0% in the subjective format but rose to 43.5% in the multiple-choice format. The most common type of incorrect response was the logical error, accounting for 65.1% of all errors. Of the 102 incorrectly answered questions, 100 were categorized as hallucinations.

Conclusion: ChatGPT-3.5 proved limited in its ability to provide evidence-based correct responses to the Korean national dental hygienist examination. Dental hygienists in educational or clinical settings should therefore view artificial intelligence-generated materials critically.

Keywords

Artificial intelligence; ChatGPT; Dental hygiene; Large language models; National examination
